Incremental learning based proactive caching mechanism for RocksDB key-value system
Keyun LUO, Baoliu YE, Bin TANG, Feng MEI, Wenda LU
Journal of Computer Applications    2020, 40 (2): 321-327.   DOI: 10.11772/j.issn.1001-9081.2019091616

The RocksDB key-value storage system, built on the Log-Structured Merge (LSM) tree, suffers from low read performance because of the constraints of its hierarchical structure. One effective remedy is to cache hot-spot data proactively, but this faces two challenges: how to predict hot-spot data when the data distribution keeps changing, and how to integrate the proactive caching mechanism with the RocksDB storage structure. To tackle these challenges, a proactive caching framework for the RocksDB key-value system was built with multiple components, including data collection, system interaction, and system evaluation; it caches hot-spot data at the low levels of the LSM tree. In addition, by modeling data access patterns, an incremental learning based method for predicting hot-spot data was designed and implemented, which reduces the number of I/O operations on the storage medium. Experimental results show that the proposed mechanism effectively improves the read performance of RocksDB under different dynamic workloads.
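
As a rough illustration of predicting hot-spot data incrementally under a shifting distribution, the sketch below keeps an exponentially decayed access count per key and updates it one batch at a time. The abstract does not specify the paper's prediction model, so this is a simple stand-in rather than the authors' method; the class name `DecayedHotKeyPredictor`, the `decay` parameter, and the batch interface are all hypothetical.

```python
import heapq
from collections import defaultdict


class DecayedHotKeyPredictor:
    """Hypothetical stand-in: exponentially decayed access counts per key."""

    def __init__(self, decay=0.5):
        self.decay = decay                # weight kept from older batches (assumed value)
        self.scores = defaultdict(float)  # key -> decayed access count

    def update(self, batch):
        """Incremental step: fold one batch of accessed keys into the scores."""
        for key in self.scores:           # age every existing score
            self.scores[key] *= self.decay
        for key in batch:                 # credit the keys seen in this batch
            self.scores[key] += 1.0

    def hot_keys(self, k):
        """Return the k keys with the highest decayed score."""
        return heapq.nlargest(k, self.scores, key=self.scores.get)


if __name__ == "__main__":
    predictor = DecayedHotKeyPredictor(decay=0.5)
    predictor.update(["a", "a", "a", "b"])
    print(predictor.hot_keys(1))
    predictor.update(["c", "c", "c", "c"])  # the workload shifts to a new key
    print(predictor.hot_keys(2))
```

Because older batches are discounted rather than discarded, the ranking follows a shift in the workload without retraining from the full access history. The keys returned by `hot_keys` would be the candidates for proactive caching.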

High performance key-value storage system based on remote direct memory access
Cheng WANG, Baoliu YE, Feng MEI, Wenda LU
Journal of Computer Applications    2020, 40 (2): 316-320.   DOI: 10.11772/j.issn.1001-9081.2019091635

As data volume and system scale continue to grow, network communication has become a performance bottleneck for key-value storage systems. Meanwhile, Remote Direct Memory Access (RDMA) supports high-bandwidth, low-latency data transmission, which opens a new approach to designing key-value storage systems. Building on RDMA in high-performance networks, a key-value storage system named Chequer, offering high performance and low CPU overhead, was designed and implemented. The basic operation workflow of the key-value storage system was redesigned around the characteristics of RDMA primitives. In addition, a shared hash table based on linear probing was designed; by solving the problem of client cache invalidation and increasing the hash hit rate, it reduces the number of client read rounds and further improves system performance. Chequer was implemented on a small-scale cluster, and its performance was demonstrated experimentally.
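
The linear probing idea can be sketched in a few lines. The example below is a generic, in-process open-addressing table, not Chequer's actual layout or protocol, and it involves no RDMA calls; the class name `LinearProbingTable`, the fixed `capacity`, and the probe counter returned by `get` are assumptions made for illustration. The point it shows is that colliding keys land in adjacent slots, so a reader that fetches a contiguous run of slots can often resolve a lookup without further round trips.

```python
class LinearProbingTable:
    """Fixed-capacity open-addressing table; no deletion or resizing in this sketch."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot is None or a (key, value) pair

    def _home(self, key):
        # Home slot of a key; probing starts here and moves to the next slot.
        return hash(key) % self.capacity

    def put(self, key, value):
        """Insert or overwrite; returns the slot index that was used."""
        idx = self._home(key)
        for _ in range(self.capacity):
            slot = self.slots[idx]
            if slot is None or slot[0] == key:
                self.slots[idx] = (key, value)
                return idx
            idx = (idx + 1) % self.capacity
        raise RuntimeError("table is full")

    def get(self, key):
        """Return (value, slots_inspected); value is None when the key is absent."""
        idx = self._home(key)
        for probes in range(1, self.capacity + 1):
            slot = self.slots[idx]
            if slot is None:
                return None, probes
            if slot[0] == key:
                return slot[1], probes
            idx = (idx + 1) % self.capacity
        return None, self.capacity


if __name__ == "__main__":
    table = LinearProbingTable(capacity=8)
    for key, value in [(1, "a"), (9, "b"), (17, "c")]:  # all three share home slot 1
        print(key, "stored in slot", table.put(key, value))
    print(table.get(17))
```

In this sketch the three colliding keys occupy consecutive slots, so a single read covering a few slots starting at the home slot would contain all of them. How many slots a client reads per round is a tuning choice that the abstract does not describe.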

Efficient storage scheme for deadline aware distributed matrix multiplication
Yongzhu ZHAO, Weidong LI, Bin TANG, Feng MEI, Wenda LU
Journal of Computer Applications    2020, 40 (2): 311-315.   DOI: 10.11772/j.issn.1001-9081.2019091640

Distributed matrix multiplication is a fundamental operation in many distributed machine learning and scientific computing applications, but its performance is strongly affected by the stragglers that commonly exist in such systems. Recently, researchers proposed a fountain code based coded matrix multiplication method that effectively mitigates the effect of stragglers by fully exploiting their partial results. However, that method does not consider the storage cost at worker nodes. By considering the tradeoff between storage cost and computation finish time, the computational deadline-aware storage optimization problem for heterogeneous worker nodes was first formulated. Then, through theoretical analysis, a solution based on expectation approximation was presented, and the problem was relaxed into a convex optimization problem so that it can be solved efficiently. Simulation results show that, while maintaining a high task success rate, the storage overhead of the proposed scheme decreases rapidly as the allowed task duration is relaxed, and the scheme greatly reduces the storage overhead introduced by encoding. In other words, the proposed scheme significantly reduces the extra storage overhead while guaranteeing that the whole computation finishes before the deadline with high probability.
